Sharc: Managing CPU and Network Bandwidth in Shared Clusters

Abstract

In this paper, we argue that commodity clusters need effective mechanisms for sharing resources among applications. To address this need, we present the design of Sharc, a system that enables resource sharing among applications in such clusters. Sharc relies on single-node resource management mechanisms such as reservations or shares and extends the benefits of such mechanisms to clustered environments. We present techniques for managing two important resources, CPU and network interface bandwidth, on a cluster-wide basis. Our techniques allow Sharc to (i) support reservation of CPU and network interface bandwidth for distributed applications, (ii) dynamically allocate resources based on past usage, and (iii) provide performance isolation to applications. Our experimental evaluation has shown that Sharc can scale to 256-node clusters running 100,000 applications. These results demonstrate that Sharc can be an effective approach for sharing resources among competing applications in moderate-size clusters.
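The abstract does not say how Sharc reallocates resources from past usage, so the following is only an illustrative sketch of the general idea, not the paper's algorithm. It assumes an application holds a fixed cluster-wide reservation split across its nodes, that per-node usage is smoothed with an exponentially weighted moving average, and that the reservation is then redistributed in proportion to the smoothed usage. The names `ewma` and `reallocate` and the `alpha` and `floor` parameters are hypothetical.

```python
def ewma(previous, observed, alpha=0.5):
    """Smooth a usage sample with an exponentially weighted moving average.

    alpha is an assumed smoothing weight, not a value from the paper.
    """
    return alpha * observed + (1 - alpha) * previous


def reallocate(reservations, smoothed_usage, floor=0.05):
    """Redistribute one application's total reservation across its nodes.

    reservations:   {node: currently reserved fraction of that node's resource}
    smoothed_usage: {node: smoothed fraction actually used on that node}
    floor:          assumed minimum share every node keeps, so an idle node
                    is never reduced to zero

    The cluster-wide total is preserved; only its split across nodes changes.
    Per-node capacity limits are not checked in this sketch.
    """
    nodes = list(reservations)
    total = sum(reservations.values())
    spare = total - floor * len(nodes)
    if spare < 0:
        raise ValueError("total reservation is smaller than the per-node floors")

    demand = sum(smoothed_usage[n] for n in nodes)
    if demand == 0:
        # No recent usage anywhere: fall back to an even split.
        return {n: total / len(nodes) for n in nodes}

    # Each node keeps the floor and receives the rest in proportion to usage.
    return {n: floor + spare * smoothed_usage[n] / demand for n in nodes}
```

In this sketch a node that has recently used more of its share receives a larger portion of the application's reservation in the next interval, while the application's aggregate reservation stays constant.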


Similar resources

Characterizing NAS Benchmark Performance on Shared Heterogeneous Networks

The goal of this research is to develop performance profiles of parallel and distributed applications in order to predict their execution time under different network conditions. This paper measures the resource requirements of the NAS benchmark programs and characterizes their performance in a shared heterogeneous environment. The programs in the benchmark suite were executed on a controlled t...


Comparison of Parallel Programming Models on Clusters of SMP Nodes

Most HPC systems are clusters of shared memory nodes. Parallel programming must combine the distributed memory parallelization on the node interconnect with the shared memory parallelization inside of each node. Various hybrid MPI+OpenMP programming models are compared with pure MPI. Benchmark results of several platforms are presented. This paper analyzes the strength and weakness of several p...


Cooperative Scheduling of Multiple Resources

Obtaining simultaneous and timely access to multiple resources is known to be an NP-complete problem [10]. Complete resource decoupling is, therefore, often used for managing end-to-end delays in distributed real-time systems where each processor is scheduled independent of the others. This decoupling approach unfortunately fails when multiple resources must be managed within a single node. Res...


Making the best of a bad situation: Prioritized storage management in GEMS

As distributed storage systems grow, the response time between the occurrence of a fault, detection, and repair becomes significant. Systems built on shared servers have additional complexity because of the high rate of service outages and revocation. Managing high replica counts in this environment becomes very costly in terms of the storage required and bandwidth consumption for file copies. ...


Heterogeneous System Coherence

Many future heterogeneous systems will integrate CPUs and GPUs physically on a single chip and logically connect them via shared memory to avoid explicit data copying. Making this shared memory coherent facilitates programming and fine-grained sharing, but throughput-oriented GPUs can overwhelm CPUs with coherence requests not well-filtered by caches. Meanwhile, region coherence has been propos...




Publication date: 2006